
Conversation

@dnandakumar-nv (Contributor) commented Jul 7, 2025

Description

📋 Summary

This PR adds comprehensive support for OpenAI's Responses API to the NeMo Agent toolkit, enabling users to leverage built-in tools (like Code Interpreter) and remote tools via Model Context Protocol (MCP) alongside existing NAT tools.

✨ What's New

Responses API Agent Implementation

  • Native Responses API Clients - New api_type config element for LLMs, plus instantiation of Responses API clients from supported packages and validation that rejects it for unsupported ones.
  • New agent type: responses_api_agent that integrates with OpenAI's Responses API
  • Multi-tool support: Combines NAT tools, OpenAI built-in tools, and MCP remote tools in a single workflow
  • Streaming support: Full compatibility with Responses API streaming capabilities
  • Error handling: Configurable tool error handling with graceful fallbacks

Key Features

  • Built-in tool integration: Direct support for Code Interpreter, file search, and image generation tools
  • MCP protocol support: Connect to remote MCP servers with flexible configuration options
  • Tool binding: Automatic tool binding to the LLM with strict=True and optional parallel tool calls
  • Validation: Runtime validation that the LLM supports the Responses API (see the sketch below)
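
For illustration, the runtime check and the binding step reduce to something like the following sketch (the helper name bind_responses_tools and the exact signatures are simplified assumptions, not the actual implementation in register.py):

from nat.data_models.llm import APITypeEnum

def bind_responses_tools(llm, llm_config, builtin_tools, mcp_tools, parallel_tool_calls=False):
    # The Responses API agent refuses to run against an LLM configured for chat completions.
    if llm_config.api_type != APITypeEnum.RESPONSES:
        raise ValueError(f"responses_api_agent requires api_type=responses, got {llm_config.api_type}")
    # Built-in and MCP tool schemas are bound directly to the LLM with strict schemas;
    # NAT tools stay with the agent graph and are not bound here.
    return llm.bind_tools(tools=[*builtin_tools, *mcp_tools], strict=True,
                          parallel_tool_calls=parallel_tool_calls)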

🔧 Configuration

Example Configuration

llms:
  openai_llm:
    _type: openai
    model_name: gpt-5-mini-2025-08-07
    api_type: responses  # Required for Responses API. Default is `chat_completions`

workflow:
  _type: responses_api_agent
  llm_name: openai_llm
  nat_tools: [current_datetime]
  builtin_tools:
    - type: code_interpreter
      container:
        type: "auto"
  mcp_tools:
    - type: mcp
      server_label: deepwiki
      server_url: https://mcp.deepwiki.com/mcp
      allowed_tools: [read_wiki_structure, read_wiki_contents]
      require_approval: never

Tool Types Supported

  • nat_tools: NeMo Agent toolkit tools (executed by the agent graph)
  • builtin_tools: OpenAI built-in tools (Code Interpreter, file search, image generation)
  • mcp_tools: Remote tools via the Model Context Protocol (schema sketched below)
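
The MCP entries map onto a small Pydantic model defined in src/nat/data_models/openai_mcp.py. The sketch below is inferred from the example configuration above; the field defaults and any enum members beyond never are assumptions rather than the actual definitions:

from enum import Enum
from pydantic import BaseModel, Field

class MCPApprovalRequiredEnum(str, Enum):
    # "never" appears in the example config; other members are assumed
    NEVER = "never"
    ALWAYS = "always"

class OpenAIMCPSchemaTool(BaseModel):
    type: str = "mcp"
    server_label: str
    server_url: str
    allowed_tools: list[str] | None = None
    require_approval: MCPApprovalRequiredEnum = MCPApprovalRequiredEnum.NEVER
    headers: dict[str, str] = Field(default_factory=dict)  # defaulted in a later commit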

📚 Documentation

Complete README Section

Added comprehensive documentation covering:

  • What is Responses API: Clear explanation of capabilities vs Chat Completions
  • Prerequisites: API key setup and model requirements
  • Usage examples: CLI commands, server setup, and curl requests
  • Configuration guide: Detailed explanation of all configuration options
  • Built-in tools: Code Interpreter setup and other available tools
  • MCP integration: Complete schema reference and examples
  • Troubleshooting: Common issues and solutions

Code Examples

# Run the agent
nat run --config_file=examples/agents/tool_calling/configs/config-responses-api.yml --input "How many 0s are in the current time?"
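
The README also covers serving the workflow and calling it over HTTP. After starting the server (for example with nat serve --config_file=..., assuming that command form), a client call might look like this sketch; the port, endpoint path, and payload key are assumptions rather than the documented schema:

# Illustrative client call (port, path, and payload key are assumptions)
import httpx

response = httpx.post(
    "http://localhost:8000/generate",
    json={"input_message": "How many 0s are in the current time?"},
    timeout=120.0,
)
response.raise_for_status()
print(response.json())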

🏗️ Implementation Details

Files Added/Modified

  • src/nat/agent/responses_api_agent/register.py: Core agent implementation
  • src/nat/data_models/openai_mcp.py: MCP tool schema definitions
  • examples/agents/tool_calling/configs/config-responses-api.yml: Example configuration
  • examples/agents/tool_calling/README.md: Comprehensive documentation
  • … additional core library changes and tests

Technical Architecture

  • New support for api_type in LLM configurations: one of chat_completions or responses (sketched below)
  • New clients in several packages that support the Responses API, usable in any workflow
  • The new agent leverages the existing ToolCallAgentGraph for consistent agent behavior
  • Binds built-in and MCP tools directly to LLM for Responses API execution
  • Maintains separation between NAT tools (graph-executed) and API tools (LLM-executed)
  • Supports configurable recursion limits and error handling
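
For reference, the api_type plumbing amounts to an enum plus one field on the LLM base config. The sketch below is simplified; the real LLMBaseConfig carries additional fields and schema extras:

from enum import Enum
from pydantic import BaseModel

class APITypeEnum(str, Enum):
    CHAT_COMPLETION = "chat_completion"  # default; existing behavior
    RESPONSES = "responses"              # opt-in OpenAI Responses API

class LLMBaseConfig(BaseModel):
    # Simplified: the actual base config defines more fields and json_schema_extra metadata.
    api_type: APITypeEnum = APITypeEnum.CHAT_COMPLETION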

✅ Testing

Manual Testing

  1. Set OpenAI API key: export OPENAI_API_KEY=<key>
  2. Run example: nat run --config_file=examples/agents/tool_calling/configs/config-responses-api.yml --input "How many 0s are in the current time?"
  3. Verify Code Interpreter works: Ask for mathematical calculations or data analysis
  4. Test MCP integration: Configure a public MCP server and verify tool calls

Validation

  • ✅ Linting passes (Vale, etc.)
  • ✅ Configuration validation works
  • ✅ Error handling for unsupported models
  • ✅ Documentation formatting correct

🚀 Usage

After merging, users can:

  1. Configure any supported model with api_type: responses
  2. Enable Code Interpreter for data analysis and computation
  3. Connect to MCP servers for extended functionality
  4. Combine multiple tool types in a single workflow
  5. Use streaming responses for real-time interaction

🔄 Backward Compatibility

  • ✅ No breaking changes to existing tool calling agent
  • ✅ New agent type is opt-in via configuration
  • ✅ Existing examples and workflows unchanged
  • ✅ Follows established NAT toolkit patterns

📖 Related Documentation

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

  • New Features

    • Added Responses API agent workflow for tool-calling automation using OpenAI's Responses API.
    • Added configuration support for Responses API (api_type parameter) across multiple frameworks.
    • Added MCP (Model Context Protocol) and built-in tool integration for agents.
  • Documentation

    • New documentation and README sections for Responses API agent configuration and usage.
  • Enhancements

    • Extended token usage tracking to include cached and reasoning tokens (field sketch after this summary).
    • Added structured tool output tracking for workflow tracing.
  • Chores

    • Updated Llama-Index framework dependencies for improved compatibility.
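
A rough sketch of the expanded token-usage model (cached_tokens and reasoning_tokens are the fields added by this PR; the pre-existing counter names shown are assumptions, not copied from the code):

from pydantic import BaseModel

class TokenUsageBaseModel(BaseModel):
    # Pre-existing counters (names assumed for illustration)
    prompt_tokens: int = 0
    completion_tokens: int = 0
    total_tokens: int = 0
    # Added in this PR
    cached_tokens: int = 0      # tokens served from the provider's prompt cache
    reasoning_tokens: int = 0   # reasoning tokens reported by the Responses API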

Upgraded multiple dependencies, including `llama-index`, `llama-index-core`, and `langchain-openai` to newer versions for compatibility and stability improvements. Adjusted `uv.lock` to reflect updated dependency revisions and metadata changes.

Signed-off-by: dnandakumar-nv <[email protected]>
Introduce `APITypeEnum` to define supported API types and add `api_type` field in `LLMBaseConfig` with default and schema extras. Update LangChain plugin to validate API type for NVIDIA and AWS Bedrock, and enable `responses` API type for OpenAI integration.

Signed-off-by: dnandakumar-nv <[email protected]>
This update introduces explicit checks for the `api_type` field in NVIDIA, OpenAI, and AWS Bedrock configurations. Unsupported API types now raise a `ValueError` with clear error messages, ensuring compatibility and preventing misconfigurations. Additionally, support for `OpenAIResponses` is added for OpenAI when the API type matches.

Signed-off-by: dnandakumar-nv <[email protected]>
Introduce an OpenAIMCPSchemaTool class and MCPApprovalRequiredEnum to define the OpenAI MCP schema. This includes fields for tool configuration, server information, and approval requirements, leveraging Pydantic models for validation.

Signed-off-by: dnandakumar-nv <[email protected]>
Introduce a new `ResponsesAPIAgentWorkflowConfig` and `responses_api_agent_workflow` function for managing an LLM-based ReAct agent that integrates tool calls, including AIQ, MCP, and built-in tools. The agent supports flexible configurations, detailed logging, and error handling, enhancing the extensibility of tool interaction workflows.

Signed-off-by: dnandakumar-nv <[email protected]>
Updated config handling to exclude "api_type" in model dumps across multiple plugins for improved consistency. Adjusted agent graph building to use `await` for proper asynchronous operation. These changes ensure better API compatibility and alignment with async programming standards.

Signed-off-by: dnandakumar-nv <[email protected]>
Updated the logic to handle cases where the response content is a list of dictionaries, ensuring compatibility and preventing errors. This change improves robustness and avoids potential crashes when extracting text from output messages.

Signed-off-by: dnandakumar-nv <[email protected]>
Added a try-except block to safely handle exceptions during tool schema extraction and log relevant errors. Also, adjusted logging for tool information to display objects directly instead of accessing their names.

Signed-off-by: dnandakumar-nv <[email protected]>
Updated MCP tools iteration for clarity by renaming variables in `register.py`. Added a default value for `headers` in `openai_mcp.py` to ensure consistent behavior when undefined.

Signed-off-by: dnandakumar-nv <[email protected]>
Improved logging messages with additional context for errors and adjusted code style for consistency across multiple files. Also fixed import order and formatting for better readability and maintainability.

Signed-off-by: dnandakumar-nv <[email protected]>
This commit introduces `config_responses_api.yml`, a configuration file defining tools, LLM settings, and workflow for a response API integration in the simple calculator example. It includes OpenAI's GPT-4.1 setup and tools like calculator operations and a code interpreter. This enhancement enables expanded functionality and better modularity.

Signed-off-by: dnandakumar-nv <[email protected]>
This change adds the required Apache 2.0 license header to the `__init__.py` file in the `responses_api_agent` module. It ensures compliance with licensing requirements and explicitly states the terms under which the code may be used.

Signed-off-by: dnandakumar-nv <[email protected]>
Updated llm_config usage to exclude the "api_type" field in model_dump across multiple LLM-related components. This ensures consistent handling of configuration objects and prevents unintended data inclusion. Also, made a minor docstring adjustment in ResponsesAPIAgentWorkflowConfig for clarity.

Signed-off-by: dnandakumar-nv <[email protected]>
Excluded "api_type" from metadata processing to improve clarity and relevance in documentation. Adjusted model and id key formatting in CrewAI and Agno LLM configurations for better readability.

Signed-off-by: dnandakumar-nv <[email protected]>
Remove skipping of the `api_type` field in metadata utilities to include it in documentation. Update tests to validate the new behavior, ensuring `api_type` is correctly handled for LLM configurations.

Signed-off-by: dnandakumar-nv <[email protected]>

copy-pr-bot bot commented Jul 7, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@dnandakumar-nv added the enhancement, non-breaking (Non-breaking change), and improvement (Improvement to existing functionality) labels and removed the enhancement label on Jul 7, 2025
Introduced `ServerToolUseSchema` to handle server-side tool output parsing. Updated callback handler to extract and store tool outputs from `message.additional_kwargs`. Minor exception handling adjustments were made for robustness in tool schema extraction and parsing processes.

Signed-off-by: dnandakumar-nv <[email protected]>
Introduce a logger to the LLM module for better debugging and observability. Added a warning log and enforced `stream=False` when using the OpenAI Responses API, as streaming is not supported in this mode.

Signed-off-by: dnandakumar-nv <[email protected]>
Introduced a logger to the Llama Index LLM module to improve traceability. Added a warning for OpenAIResponses API users about the lack of support for AIQ callback handlers and the absence of intermediate step logging.

Signed-off-by: dnandakumar-nv <[email protected]>
Introduced a `ServerToolUseSchema` model for handling tool usage data from server-side tool calls. Updated the `LlamaIndexProfilerHandler` to process and log `built_in_tool_calls` from response metadata. Simplified logging and removed unnecessary warnings in the LLM plugin code.

Signed-off-by: dnandakumar-nv <[email protected]>
Moved the `ServerToolUseSchema` class to `intermediate_step.py` for reusability across modules. Adjusted imports and removed redundant definitions in `langchain_callback_handler.py` and `llama_index_callback_handler.py`.

Signed-off-by: dnandakumar-nv <[email protected]>
Introduced `_validate_no_responses_api` to ensure LLM configurations do not use the unsupported Responses API for Semantic Kernel, CrewAI, and Agno connectors. Updated specific workflow registrations to integrate this validation logic, improving robustness and preventing misconfigurations.

Signed-off-by: dnandakumar-nv <[email protected]>
This update introduces comprehensive unit tests for multiple frameworks, including LLaMA-Index, CrewAI, LangChain, Semantic Kernel, and Agno. Key additions include validation for API types, parameter passthroughs, and decorator registrations, ensuring correctness and reliability of wrapper implementations.

Signed-off-by: dnandakumar-nv <[email protected]>
This update introduces comprehensive unit tests for multiple frameworks, including LLaMA-Index, CrewAI, LangChain, Semantic Kernel, and Agno. Key additions include validation for API types, parameter passthroughs, and decorator registrations, ensuring correctness and reliability of wrapper implementations.

Signed-off-by: dnandakumar-nv <[email protected]>
@dnandakumar-nv marked this pull request as ready for review July 8, 2025 14:04
Added "raising-format-tuple" and "comparison-with-callable" to pylint disable directives to suppress unnecessary warnings. This ensures the code adheres to the desired linting standards without being cluttered by irrelevant alerts.
@dnandakumar-nv
Contributor Author

/ok to test 06370c7

Replaces old `aiq` commands with updated `nat` commands and adjusts an example input. Improves clarity by formatting tool references and field details with consistent code styling. Refines descriptions of built-in and remote tools for better readability.
@dnandakumar-nv
Contributor Author

/ok to test 9149a2d

Replaces old `aiq` commands with updated `nat` commands and adjusts an example input. Improves clarity by formatting tool references and field details with consistent code styling. Refines descriptions of built-in and remote tools for better readability.
@dnandakumar-nv
Contributor Author

/ok to test 605de7b

This commit introduces a new documentation file explaining the integration of OpenAI's Responses API with the NeMo Agent toolkit. It details LLM and agent configuration, tool usage (built-in, NAT, and MCP), and configurable options, providing examples for setup and execution.
Introduce tests to validate Responses API agent functionality, including tool binding, LLM capability checks, and workflow execution. These ensure proper integration of NAT tools, MCP tools, and built-in tools with the agent.
@dnandakumar-nv
Contributor Author

/ok to test e786c12

Updated the workflows documentation to include the Responses API and Agent. Adjusted test script to disable specific pylint rules related to unused arguments and non-async context managers.
@dnandakumar-nv
Contributor Author

/ok to test 66ff758

dnandakumar-nv and others added 7 commits August 13, 2025 19:44
Updated the workflows documentation to include the Responses API and Agent. Adjusted test script to disable specific pylint rules related to unused arguments and non-async context managers.
…pport

# Conflicts:
#	examples/agents/tool_calling/README.md
#	src/nat/profiler/callbacks/llama_index_callback_handler.py
…pport

# Conflicts:
#	packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py
#	packages/nvidia_nat_llama_index/pyproject.toml
#	packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py
#	packages/nvidia_nat_semantic_kernel/src/nat/plugins/semantic_kernel/llm.py
#	uv.lock
Revised error messages to reflect specific LLM support for API types. Updated multiple dependencies including `llama-index` and `langchain-openai` to their latest versions for improved compatibility and features.

Signed-off-by: dnandakumar-nv <[email protected]>
# Conflicts:
#	docs/source/workflows/about/index.md
#	examples/frameworks/multi_frameworks/pyproject.toml
#	packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py
#	packages/nvidia_nat_agno/tests/test_llm_agno.py
#	packages/nvidia_nat_crewai/src/nat/plugins/crewai/llm.py
#	packages/nvidia_nat_langchain/pyproject.toml
#	packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py
#	packages/nvidia_nat_llama_index/pyproject.toml
#	packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py
#	packages/nvidia_nat_semantic_kernel/src/nat/plugins/semantic_kernel/llm.py
#	src/nat/eval/utils/weave_eval.py
#	uv.lock
@dnandakumar-nv requested a review from a team as a code owner October 16, 2025 14:24

coderabbitai bot commented Oct 16, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

This PR introduces support for OpenAI's Responses API across multiple LLM frameworks, adding a new responses_api_agent workflow, API type enumeration, structured tool output tracking, and token usage metrics expansion. It includes framework-specific routing, validation logic, comprehensive test coverage, and documentation.

Changes

  • Data Models & Enums (src/nat/data_models/llm.py, src/nat/data_models/openai_mcp.py, src/nat/data_models/intermediate_step.py, src/nat/eval/usage_stats.py): Added APITypeEnum with "chat_completion" and "responses" values; added api_type field to LLMBaseConfig; introduced OpenAIMCPSchemaTool and MCPApprovalRequiredEnum for MCP configuration; added ServerToolUseSchema for structured tool output; expanded UsageStatsLLM with cached_tokens and reasoning_tokens fields.
  • Responses API Agent Workflow (src/nat/agent/responses_api_agent/__init__.py, src/nat/agent/responses_api_agent/register.py, src/nat/agent/register.py): Introduced new ResponsesAPIAgentWorkflowConfig and responses_api_agent_workflow function; integrates NAT/MCP/builtin tools with LLM Responses API support; registered workflow in central agent registry.
  • AgNO LLM Integration (packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py, packages/nvidia_nat_agno/tests/test_llm_agno.py): Added API type validation for NIM/LiteALLM backends (no Responses API); added conditional OpenAIResponses instantiation for OpenAI when api_type is RESPONSES; added test fixtures and test cases for Responses API paths.
  • CrewAI LLM Integration (packages/nvidia_nat_crewai/src/nat/plugins/crewai/llm.py, packages/nvidia_nat_crewai/tests/test_llm_crewai.py): Added validate_no_responses_api checks for NIM and OpenAI backends; introduced comprehensive test suite validating wrapper instantiation, API type enforcement, environment variable handling, and registry mapping.
  • LangChain LLM Integration (packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py, packages/nvidia_nat_langchain/tests/test_llm_langchain.py): Added API type validation for Bedrock/Azure/NVIDIA backends; conditional OpenAI client construction: uses_responses_api and use_previous_response_id when RESPONSES type detected; added extensive test coverage for all paths.
  • Llama-Index LLM Integration (packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py, packages/nvidia_nat_llama_index/tests/test_llm_llama_index.py, packages/nvidia_nat_llama_index/pyproject.toml): Added API type validation for AWS Bedrock/Azure backends; conditional OpenAIResponses instantiation for OpenAI RESPONSES path; bumped llama-index dependencies (core, llms-nvidia, llms-openai, llama-index); added new test module with full framework coverage.
  • Semantic Kernel LLM Integration (packages/nvidia_nat_semantic_kernel/src/nat/plugins/semantic_kernel/llm.py, packages/nvidia_nat_semantic_kernel/tests/test_llm_sk.py): Added validate_no_responses_api for AzureOpenAI and OpenAI backends; introduced comprehensive test module validating basic creation, Responses API blocking, and registration decorator.
  • Profiler & Callbacks (src/nat/profiler/callbacks/token_usage_base_model.py, src/nat/profiler/callbacks/langchain_callback_handler.py, src/nat/profiler/callbacks/llama_index_callback_handler.py, src/nat/eval/evaluate.py): Extended TokenUsageBaseModel with cached_tokens and reasoning_tokens; enhanced langchain callback to collect tool_outputs and extract new token types from message metadata; added llama_index token extraction helper and tool output assembly; updated usage stats accumulation in evaluate.py.
  • Utilities & Validation (src/nat/utils/responses_api.py): Added validate_no_responses_api utility function enforcing non-Responses API types for frameworks without support.
  • Examples & Documentation (examples/agents/tool_calling/configs/config-responses-api.yml, examples/agents/tool_calling/README.md, docs/source/workflows/about/index.md, docs/source/workflows/about/responses-api-and-agent.md): Added responses_api_agent workflow YAML configuration example; expanded tool calling README with Responses API section; added navigation entry in workflows index; created comprehensive Responses API documentation with setup, configuration, and MCP reference.
  • Tests (tests/nat/agent/test_responses_api_agent.py): Introduced agent workflow integration tests validating LLM Responses API support requirement, tool binding, and end-to-end execution.
  • Dependencies (examples/frameworks/multi_frameworks/pyproject.toml): Removed trailing comma from markdown-it-py dependency.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Builder
    participant ResponsesAPIAgent as responses_api_agent_workflow
    participant LLM
    participant Tools
    participant Graph as ToolCallAgentGraph

    User->>ResponsesAPIAgent: Call workflow
    ResponsesAPIAgent->>LLM: Get LLM from builder
    LLM-->>ResponsesAPIAgent: Check api_type == RESPONSES
    alt api_type != RESPONSES
        ResponsesAPIAgent->>ResponsesAPIAgent: AssertionError
    end
    ResponsesAPIAgent->>Tools: Aggregate NAT, MCP, builtin tools
    ResponsesAPIAgent->>LLM: Bind tools to LLM
    LLM->>LLM: Store bound_tools, bound_parallel
    ResponsesAPIAgent->>Graph: Construct ToolCallAgentGraph
    ResponsesAPIAgent->>Graph: Compile and return _response_fn
    User->>ResponsesAPIAgent: Call response_fn(input)
    ResponsesAPIAgent->>Graph: Run with input as HumanMessage
    Graph->>LLM: Agent calls LLM with tools
    LLM->>Tools: Execute tool calls
    Tools-->>Graph: Return results
    Graph-->>ResponsesAPIAgent: Final agent state
    ResponsesAPIAgent-->>User: Extract and return output
sequenceDiagram
    participant Framework
    participant LLMPlugin as LLM Plugin
    participant APIValidation
    participant Client as LLM Client

    Framework->>LLMPlugin: Initialize LLM
    LLMPlugin->>APIValidation: Check api_type
    alt api_type == RESPONSES
        APIValidation->>LLMPlugin: Route to Responses path
        LLMPlugin->>Client: Instantiate OpenAIResponses
        Client->>Client: Enable use_responses_api
    else api_type == CHAT_COMPLETION
        APIValidation->>LLMPlugin: Route to standard path
        LLMPlugin->>Client: Instantiate standard client
    else Unsupported API type
        APIValidation->>LLMPlugin: Raise ValueError
    end
    Client-->>Framework: Client ready

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

This PR involves substantial, multi-layered changes across five LLM framework integrations (AgNO, CrewAI, LangChain, Llama-Index, Semantic Kernel), requiring separate reasoning for each framework's routing logic. Additional complexity stems from interconnected data models, new agent workflow implementation with tool aggregation, profiler enhancements for structured tool output tracking, and comprehensive test suites. While many changes follow a repeating pattern (API type validation and routing), each framework exhibits distinct integration points and conditional logic that demand careful verification. The mix of data models, utilities, agent logic, and callback instrumentation necessitates understanding multiple concerns in context.

Pre-merge checks

Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error)
  • Title Check (❌ Error): The title succinctly describes the addition of a new agent and example and is within the recommended length, but it is phrased in the past tense rather than using the imperative mood as stipulated by the title guidelines. Resolution: Please rephrase the title in the imperative mood, for example: “Add new agent and example utilizing the OpenAI Responses API.”
✅ Passed checks (2 passed)
  • Description Check (✅ Passed): Check skipped - CodeRabbit’s high-level summary is enabled.
  • Docstring Coverage (✅ Passed): No functions found in the changes. Docstring coverage check skipped.

@coderabbitai bot added the feature request (New feature or request) label on Oct 16, 2025
@coderabbitai bot left a comment


Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py (1)

80-91: System prompt injection may corrupt non-text SystemMessage content.

If SystemMessage.content is a structured list (multimodal), str() and content replacement will coerce it to a string and lose structure. Safer: only append when content is str; otherwise insert a new SystemMessage before the first message.

Apply this diff:

-                for i, message in enumerate(messages):
-                    if isinstance(message, SystemMessage):
-                        if self.system_prompt not in str(message.content):
-                            messages = list(messages)
-                            messages[i] = SystemMessage(content=f"{message.content}\n{self.system_prompt}")
-                        break
+                for i, message in enumerate(messages):
+                    if isinstance(message, SystemMessage):
+                        if isinstance(message.content, str):
+                            if self.system_prompt not in message.content:
+                                messages = list(messages)
+                                messages[i] = SystemMessage(content=f"{message.content}\n{self.system_prompt}")
+                        else:
+                            # preserve original structured content; insert a separate system message
+                            messages = list(messages)
+                            messages.insert(0, SystemMessage(content=self.system_prompt))
+                        break
♻️ Duplicate comments (3)
packages/nvidia_nat_semantic_kernel/src/nat/plugins/semantic_kernel/llm.py (1)

110-111: Same validation issue applies here.

See comment on lines 93-94 regarding the bug in validate_no_responses_api.

packages/nvidia_nat_crewai/src/nat/plugins/crewai/llm.py (2)

118-119: Verify the error handling in validate_no_responses_api.

The validation call is correctly placed. However, the same bug noted in the Semantic Kernel plugin applies here: the validate_no_responses_api function constructs ValueError incorrectly with two arguments instead of formatting the string.

See the comment on packages/nvidia_nat_semantic_kernel/src/nat/plugins/semantic_kernel/llm.py lines 93-94 for details and verification script.


139-140: Same validation issue applies here.

See comment on lines 118-119 regarding the bug in validate_no_responses_api.

🧹 Nitpick comments (30)
packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py (3)

86-89: Consider extracting error messages to exception subclasses.

The API type validation logic is correct. However, the error message is defined inline, which violates Ruff's TRY003 guideline. For better maintainability, consider creating custom exception classes or extracting these messages:

+class UnsupportedAPITypeError(ValueError):
+    """Raised when an unsupported API type is used for a backend."""
+    pass
+
 @register_llm_client(config_type=AWSBedrockModelConfig, wrapper_type=LLMFrameworkEnum.LLAMA_INDEX)
 async def aws_bedrock_llama_index(llm_config: AWSBedrockModelConfig, _builder: Builder):
 
     from llama_index.llms.bedrock import Bedrock
 
     if llm_config.api_type != APITypeEnum.CHAT_COMPLETION:
-        raise ValueError("AWS Bedrock LLM only supports chat completion API type. "
-                         f"Received: {llm_config.api_type}")
+        raise UnsupportedAPITypeError(
+            f"AWS Bedrock LLM only supports chat completion API type. Received: {llm_config.api_type}"
+        )

101-104: Same refactoring suggestion applies.

See comment on lines 86-89. The validation logic is correct, but consider extracting the error message to a custom exception class for consistency and maintainability.


115-118: Same refactoring suggestion applies.

See comment on lines 86-89. The validation logic is correct, but consider extracting the error message to a custom exception class for consistency and maintainability.

src/nat/data_models/intermediate_step.py (1)

106-112: Tighten schema and add docstring for ServerToolUseSchema.

  • arguments: the union ends with Any, making the type effectively Any. Simplify.
  • Add a short docstring (public model; docs required).

As per coding guidelines

 class ServerToolUseSchema(BaseModel):
-    name: str
-    arguments: str | dict[str, typing.Any] | typing.Any
-    output: typing.Any
-
-    model_config = ConfigDict(extra="ignore")
+    """Structured representation of a server-executed tool call."""
+
+    name: str
+    arguments: typing.Any
+    output: typing.Any
+
+    model_config = ConfigDict(extra="ignore")
examples/agents/tool_calling/configs/config-responses-api.yml (1)

20-25: Provide a test-stub variant for CI/offline runs.

Examples are great; for tests/CI, add a companion config (or commented lines) that sets llms.openai_llm._type: nat_test_llm with response_seq/delay_ms to comply with YAML stubbing guidance.

As per coding guidelines

packages/nvidia_nat_langchain/tests/test_llm_langchain.py (2)

61-61: Remove stray print in tests.

Avoid test output noise.

-            print(kwargs)

170-184: Strengthen registration test; current assertions are tautological.

You set _llm_client_map directly, then assert what you set. Import the module after patching GlobalTypeRegistry to validate decorator effects instead.

-@patch("nat.cli.type_registry.GlobalTypeRegistry")
-def test_decorator_registration(mock_global_registry):
-    """Ensure register_llm_client decorators registered the LangChain wrappers."""
-    registry = MagicMock()
-    mock_global_registry.get.return_value = registry
-
-    registry._llm_client_map = {
-        (NIMModelConfig, LLMFrameworkEnum.LANGCHAIN): nim_langchain,
-        (OpenAIModelConfig, LLMFrameworkEnum.LANGCHAIN): openai_langchain,
-        (AWSBedrockModelConfig, LLMFrameworkEnum.LANGCHAIN): aws_bedrock_langchain,
-    }
-
-    assert registry._llm_client_map[(NIMModelConfig, LLMFrameworkEnum.LANGCHAIN)] is nim_langchain
-    assert registry._llm_client_map[(OpenAIModelConfig, LLMFrameworkEnum.LANGCHAIN)] is openai_langchain
-    assert registry._llm_client_map[(AWSBedrockModelConfig, LLMFrameworkEnum.LANGCHAIN)] is aws_bedrock_langchain
+@patch("nat.cli.type_registry.GlobalTypeRegistry")
+def test_decorator_registration(mock_global_registry):
+    """Ensure register_llm_client decorators registered the LangChain wrappers."""
+    registry = MagicMock()
+    mock_global_registry.get.return_value = registry
+
+    # Import after patching so decorators register against our mock registry
+    import importlib
+    importlib.import_module("nat.plugins.langchain.llm")
+
+    # Validate that registrations occurred
+    calls = [c.args for c in registry.register_llm_client.call_args_list]
+    assert (NIMModelConfig, LLMFrameworkEnum.LANGCHAIN) in [args[:2] for args in calls]
+    assert (OpenAIModelConfig, LLMFrameworkEnum.LANGCHAIN) in [args[:2] for args in calls]
+    assert (AWSBedrockModelConfig, LLMFrameworkEnum.LANGCHAIN) in [args[:2] for args in calls]
src/nat/profiler/callbacks/llama_index_callback_handler.py (1)

203-215: Nit: fix comment branding and ensure tool_outputs type matches.

  • Update “AIQ Toolkit” to “NAT”.
  • Ensure TraceMetadata.tool_outputs remains a list (paired with earlier type fix).
-                # Append usage data to AIQ Toolkit usage stats
+                # Append usage data to NAT usage stats
                 with self._lock:
                     stats = IntermediateStepPayload(
                         event_type=IntermediateStepType.LLM_END,
                         span_event_timestamp=self._run_id_to_timestamp.get(event_id),
                         framework=LLMFrameworkEnum.LLAMA_INDEX,
                         name=model_name,
                         UUID=event_id,
                         data=StreamEventData(input=self._run_id_to_llm_input.get(event_id), output=llm_text_output),
-                        metadata=TraceMetadata(chat_responses=response.message if response.message else None,
-                                               tool_outputs=tool_outputs_list if tool_outputs_list else []),
+                        metadata=TraceMetadata(
+                            chat_responses=response.message if response.message else None,
+                            tool_outputs=tool_outputs_list if tool_outputs_list else [],
+                        ),
                         usage_info=UsageInfo(token_usage=self._extract_token_usage(response)))
packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py (3)

120-123: Tighten/standardize validation error messages; align with Ruff TRY003.

Current ValueError strings are long. Consider shorter, consistent messages (and/or a custom exception) to satisfy ruff TRY003 and reduce noise.

Example:

-    if llm_config.api_type != APITypeEnum.CHAT_COMPLETION:
-        raise ValueError("AWS Bedrock only supports chat completion API type. "
-                         f"Received: {llm_config.api_type}")
+    if llm_config.api_type != APITypeEnum.CHAT_COMPLETION:
+        raise ValueError(f"AWS Bedrock: unsupported api_type={llm_config.api_type}; expected chat_completions")

Repeat similarly for Azure OpenAI and NVIDIA AI Endpoints. Based on static analysis hints.

Also applies to: 139-141, 152-155


191-198: Consider explicit api_type validation for LiteLLM wrapper.

To prevent ambiguous behavior across providers via LiteLLM, validate api_type (or document accepted values) similar to other wrappers.


115-131: Add minimal docstrings and return types for public registered clients.

Registered functions are public API; add concise Google-style docstrings and precise return types (AsyncIterator[...] or concrete Chat model) per coding guidelines.

Also applies to: 133-145, 147-163, 165-189, 191-198

tests/nat/agent/test_responses_api_agent.py (2)

30-37: Align mock Builder interface: make get_tools async.

Real Builder.get_tools is async; tests using a sync stub can mask bugs. Update to async to match the interface.

Apply this diff:

-    def get_tools(self, tool_names, wrapper_type):
+    async def get_tools(self, tool_names, wrapper_type):
         # match interface and avoid unused warnings
         return self._tools

46-55: Remove unused noqa and simplify setattr.

  • The noqa: D401 is unused.
  • Direct attribute assignment on class is clearer in tests.

Apply this diff:

-    def bind_tools(self, *, tools, parallel_tool_calls=False, strict=True):  # noqa: D401
+    def bind_tools(self, *, tools, parallel_tool_calls=False, strict=True):
         # Store on class to avoid Pydantic instance attribute restrictions
-        klass = type(self)
-        setattr(klass, "bound_tools", tools)
-        setattr(klass, "bound_parallel", parallel_tool_calls)
-        setattr(klass, "bound_strict", strict)
+        klass = type(self)
+        klass.bound_tools = tools
+        klass.bound_parallel = parallel_tool_calls
+        klass.bound_strict = strict
         return self
src/nat/agent/responses_api_agent/register.py (4)

116-121: Logging: simplify logger.exception usage and avoid echoing ex twice.

logger.exception already logs the stack; passing exc_info and formatting ex is redundant per guidelines.

Apply this diff:

-        except Exception as ex:
-            logger.exception("%s Tool Calling Agent failed with exception: %s", AGENT_LOG_PREFIX, ex, exc_info=ex)
+        except Exception:
+            logger.exception("%s Tool Calling Agent failed", AGENT_LOG_PREFIX)
             if config.verbose:
-                return str(ex)
+                return "Tool Calling Agent failed; see logs for details."
             return "I seem to be having a problem."

122-127: Use non-exception log for normal generator close; fix workflow name.

GeneratorExit isn’t an error; prefer debug/info and correct the workflow name.

Apply this diff:

-    except GeneratorExit:
-        logger.exception("%s Workflow exited early!", AGENT_LOG_PREFIX, exc_info=True)
+    except GeneratorExit:
+        logger.debug("%s Workflow exited early.", AGENT_LOG_PREFIX)
     finally:
-        logger.debug("%s Cleaning up react_agent workflow.", AGENT_LOG_PREFIX)
+        logger.debug("%s Cleaning up responses_api_agent workflow.", AGENT_LOG_PREFIX)

34-58: Small typos and clarity in config fields/docstring.

  • “an NeMo” → “a NeMo”
  • “inbetween” → “in between”
  • “stoping” → “stopping”
  • “functions use” → “function’s use”

Apply this diff:

-    Defines an NeMo Agent Toolkit function that uses a Responses API
-    Agent performs reasoning inbetween tool calls, and utilizes the
+    Defines a NeMo Agent toolkit function that uses the Responses API.
+    The agent performs reasoning in between tool calls and uses the
     tool names and descriptions to select the optimal tool.
@@
-    max_iterations: int = Field(default=15, description="Number of tool calls before stoping the agent.")
-    description: str = Field(default="Agent Workflow", description="The description of this functions use.")
+    max_iterations: int = Field(default=15, description="Number of tool calls before stopping the agent.")
+    description: str = Field(default="Agent Workflow", description="The description of this function’s use.")

60-66: Add a concise docstring to the public workflow function.

Comply with docs guidelines; summarize inputs/outputs.

Example:

async def responses_api_agent_workflow(...):
    """Responses API agent workflow binding tools to an LLM and executing a tool-calling graph."""
src/nat/profiler/callbacks/langchain_callback_handler.py (2)

52-60: Narrow exception and include context at debug level.

Catching broad Exception is okay here but consider logging the exception type for visibility.

Apply this diff:

-            except Exception:
-                logger.debug(
+            except Exception as e:
+                logger.debug(
                     "Failed to parse tool schema from invocation params: %s. \n This "
                     "can occur when the LLM server has native tools and can be ignored if "
                     "using the responses API.",
-                    tool)
+                    tool,
+                )

254-264: Avoid silent swallow when parsing tool_outputs; log at debug.

Currently try/except/pass hides parsing issues.

Apply this diff:

-                for tool in tools_outputs:
-                    try:
-                        tool_outputs_list.append(ServerToolUseSchema(**tool))
-                    except Exception:
-                        pass
+                for tool in tools_outputs:
+                    try:
+                        tool_outputs_list.append(ServerToolUseSchema(**tool))
+                    except Exception as e:
+                        logger.debug("Failed to parse tool_outputs item: %r (%s)", tool, e)
docs/source/workflows/about/reponses-api-and-agent.md (3)

20-25: Use official product naming on first mention.

Per docs guidelines: first use “NVIDIA NeMo Agent toolkit”, then “NeMo Agent toolkit”.

Apply this diff:

-The NeMo Agent toolkit supports OpenAI's Responses API through two complementary pieces:
+The NVIDIA NeMo Agent toolkit supports OpenAI's Responses API through two complementary pieces:

As per coding guidelines.


29-34: Avoid “NAT” abbreviation in docs; keep it to code identifiers.

Use “toolkit tools” in prose; keep nat_tools only in code blocks.

Apply this diff:

-- **NAT Tools**: Continue to use toolkit tools through `nat_tools` (executed by the agent graph).
+- **Toolkit tools**: Continue to use toolkit tools through the `nat_tools` field (executed by the agent graph).
@@
-  # NAT tools are executed by the agent graph
+  # Toolkit tools (nat_tools) are executed by the agent graph

Based on coding guidelines.

Also applies to: 84-89


51-55: Minor grammar/clarity nits.

  • “stoping”→“stopping” already addressed elsewhere, but ensure consistency in docs.
  • Consider rephrasing “Used when the workflow is exported as a function.” to “Used when exporting the workflow as a function.”

Also applies to: 102-113

packages/nvidia_nat_crewai/tests/test_llm_crewai.py (2)

44-50: Add return type hints to fixtures.

Aligns tests with our typing guideline.

Apply:

 @pytest.fixture
-def nim_cfg(self):
+def nim_cfg(self) -> NIMModelConfig:
     return NIMModelConfig(model_name="test-nim")

 @pytest.fixture
-def nim_cfg_responses(self):
+def nim_cfg_responses(self) -> NIMModelConfig:
     return NIMModelConfig(model_name="test-nim", api_type=APITypeEnum.RESPONSES)

As per coding guidelines.


140-147: Avoid asserting on private registry attributes.

Accessing _llm_client_map couples tests to internals. Prefer a public accessor or registration API if available.

packages/nvidia_nat_semantic_kernel/tests/test_llm_sk.py (2)

40-47: Add return type hints to fixtures.

Keep tests consistently typed.

Apply:

 @pytest.fixture
-def oa_cfg(self):
+def oa_cfg(self) -> OpenAIModelConfig:
     return OpenAIModelConfig(model_name="gpt-4o")

 @pytest.fixture
-def oa_cfg_responses(self):
+def oa_cfg_responses(self) -> OpenAIModelConfig:
     # Using the RESPONSES API must be rejected by the wrapper.
     return OpenAIModelConfig(model_name="gpt-4o", api_type=APITypeEnum.RESPONSES)

As per coding guidelines.


77-82: Prefer public registry access over private map.

Testing via _llm_client_map is brittle. Use a public getter/registration method if exposed.

packages/nvidia_nat_llama_index/tests/test_llm_llama_index.py (4)

40-51: Type fixtures for Nim tests.

Add return type hints to match guidelines.

Apply:

 @pytest.fixture
-def mock_builder(self):
+def mock_builder(self) -> Builder:
     return MagicMock(spec=Builder)

 @pytest.fixture
-def nim_cfg(self):
+def nim_cfg(self) -> NIMModelConfig:
     return NIMModelConfig(model_name="nemotron-3b")

 @pytest.fixture
-def nim_cfg_bad_api(self):
+def nim_cfg_bad_api(self) -> NIMModelConfig:
     return NIMModelConfig(model_name="nemotron-3b", api_type=APITypeEnum.RESPONSES)

As per coding guidelines.


82-89: Type fixtures for OpenAI tests.

Apply:

 @pytest.fixture
-def oa_cfg_chat(self):
+def oa_cfg_chat(self) -> OpenAIModelConfig:
     return OpenAIModelConfig(model_name="gpt-4o", base_url=None)

 @pytest.fixture
-def oa_cfg_responses(self):
+def oa_cfg_responses(self) -> OpenAIModelConfig:
     return OpenAIModelConfig(model_name="gpt-4o", api_type=APITypeEnum.RESPONSES, temperature=0.1)

As per coding guidelines.


123-130: Type fixtures for Bedrock tests.

Apply:

 @pytest.fixture
-def br_cfg(self):
+def br_cfg(self) -> AWSBedrockModelConfig:
     return AWSBedrockModelConfig(model_name="ai21.j2-ultra")

 @pytest.fixture
-def br_cfg_bad_api(self):
+def br_cfg_bad_api(self) -> AWSBedrockModelConfig:
     return AWSBedrockModelConfig(model_name="ai21.j2-ultra", api_type=APITypeEnum.RESPONSES)

As per coding guidelines.


151-165: Registration test uses private attribute.

Consider asserting via public registry APIs to reduce brittleness.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d0ac939 and 4fc8080.

📒 Files selected for processing (29)
  • docs/source/workflows/about/index.md (1 hunks)
  • docs/source/workflows/about/reponses-api-and-agent.md (1 hunks)
  • examples/agents/tool_calling/README.md (2 hunks)
  • examples/agents/tool_calling/configs/config-responses-api.yml (1 hunks)
  • examples/frameworks/multi_frameworks/pyproject.toml (1 hunks)
  • packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py (4 hunks)
  • packages/nvidia_nat_agno/tests/test_llm_agno.py (6 hunks)
  • packages/nvidia_nat_crewai/src/nat/plugins/crewai/llm.py (3 hunks)
  • packages/nvidia_nat_crewai/tests/test_llm_crewai.py (1 hunks)
  • packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py (7 hunks)
  • packages/nvidia_nat_langchain/tests/test_llm_langchain.py (1 hunks)
  • packages/nvidia_nat_llama_index/pyproject.toml (1 hunks)
  • packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py (5 hunks)
  • packages/nvidia_nat_llama_index/tests/test_llm_llama_index.py (1 hunks)
  • packages/nvidia_nat_semantic_kernel/src/nat/plugins/semantic_kernel/llm.py (3 hunks)
  • packages/nvidia_nat_semantic_kernel/tests/test_llm_sk.py (1 hunks)
  • src/nat/agent/register.py (1 hunks)
  • src/nat/agent/responses_api_agent/__init__.py (1 hunks)
  • src/nat/agent/responses_api_agent/register.py (1 hunks)
  • src/nat/data_models/intermediate_step.py (1 hunks)
  • src/nat/data_models/llm.py (1 hunks)
  • src/nat/data_models/openai_mcp.py (1 hunks)
  • src/nat/eval/evaluate.py (1 hunks)
  • src/nat/eval/usage_stats.py (1 hunks)
  • src/nat/profiler/callbacks/langchain_callback_handler.py (6 hunks)
  • src/nat/profiler/callbacks/llama_index_callback_handler.py (4 hunks)
  • src/nat/profiler/callbacks/token_usage_base_model.py (1 hunks)
  • src/nat/utils/responses_api.py (1 hunks)
  • tests/nat/agent/test_responses_api_agent.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (19)
**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

  • src/nat/agent/responses_api_agent/__init__.py
  • src/nat/eval/evaluate.py
  • src/nat/utils/responses_api.py
  • src/nat/eval/usage_stats.py
  • src/nat/profiler/callbacks/token_usage_base_model.py
  • packages/nvidia_nat_crewai/src/nat/plugins/crewai/llm.py
  • src/nat/agent/register.py
  • packages/nvidia_nat_langchain/tests/test_llm_langchain.py
  • src/nat/data_models/intermediate_step.py
  • src/nat/agent/responses_api_agent/register.py
  • examples/agents/tool_calling/configs/config-responses-api.yml
  • src/nat/profiler/callbacks/llama_index_callback_handler.py
  • tests/nat/agent/test_responses_api_agent.py
  • src/nat/data_models/openai_mcp.py
  • packages/nvidia_nat_crewai/tests/test_llm_crewai.py
  • packages/nvidia_nat_semantic_kernel/src/nat/plugins/semantic_kernel/llm.py
  • packages/nvidia_nat_llama_index/tests/test_llm_llama_index.py
  • src/nat/profiler/callbacks/langchain_callback_handler.py
  • packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py
  • packages/nvidia_nat_semantic_kernel/tests/test_llm_sk.py
  • packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py
  • packages/nvidia_nat_agno/tests/test_llm_agno.py
  • src/nat/data_models/llm.py
  • packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

  • src/nat/agent/responses_api_agent/__init__.py
  • src/nat/eval/evaluate.py
  • src/nat/utils/responses_api.py
  • src/nat/eval/usage_stats.py
  • src/nat/profiler/callbacks/token_usage_base_model.py
  • packages/nvidia_nat_crewai/src/nat/plugins/crewai/llm.py
  • src/nat/agent/register.py
  • packages/nvidia_nat_langchain/tests/test_llm_langchain.py
  • src/nat/data_models/intermediate_step.py
  • src/nat/agent/responses_api_agent/register.py
  • src/nat/profiler/callbacks/llama_index_callback_handler.py
  • tests/nat/agent/test_responses_api_agent.py
  • src/nat/data_models/openai_mcp.py
  • packages/nvidia_nat_crewai/tests/test_llm_crewai.py
  • packages/nvidia_nat_semantic_kernel/src/nat/plugins/semantic_kernel/llm.py
  • packages/nvidia_nat_llama_index/tests/test_llm_llama_index.py
  • src/nat/profiler/callbacks/langchain_callback_handler.py
  • packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py
  • packages/nvidia_nat_semantic_kernel/tests/test_llm_sk.py
  • packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py
  • packages/nvidia_nat_agno/tests/test_llm_agno.py
  • src/nat/data_models/llm.py
  • packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py
src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All importable Python code must live under src/ (or packages//src/)

Files:

  • src/nat/agent/responses_api_agent/__init__.py
  • src/nat/eval/evaluate.py
  • src/nat/utils/responses_api.py
  • src/nat/eval/usage_stats.py
  • src/nat/profiler/callbacks/token_usage_base_model.py
  • src/nat/agent/register.py
  • src/nat/data_models/intermediate_step.py
  • src/nat/agent/responses_api_agent/register.py
  • src/nat/profiler/callbacks/llama_index_callback_handler.py
  • src/nat/data_models/openai_mcp.py
  • src/nat/profiler/callbacks/langchain_callback_handler.py
  • src/nat/data_models/llm.py
src/nat/**/*

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Changes in src/nat should prioritize backward compatibility

Files:

  • src/nat/agent/responses_api_agent/__init__.py
  • src/nat/eval/evaluate.py
  • src/nat/utils/responses_api.py
  • src/nat/eval/usage_stats.py
  • src/nat/profiler/callbacks/token_usage_base_model.py
  • src/nat/agent/register.py
  • src/nat/data_models/intermediate_step.py
  • src/nat/agent/responses_api_agent/register.py
  • src/nat/profiler/callbacks/llama_index_callback_handler.py
  • src/nat/data_models/openai_mcp.py
  • src/nat/profiler/callbacks/langchain_callback_handler.py
  • src/nat/data_models/llm.py

⚙️ CodeRabbit configuration file

This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.

Files:

  • src/nat/agent/responses_api_agent/__init__.py
  • src/nat/eval/evaluate.py
  • src/nat/utils/responses_api.py
  • src/nat/eval/usage_stats.py
  • src/nat/profiler/callbacks/token_usage_base_model.py
  • src/nat/agent/register.py
  • src/nat/data_models/intermediate_step.py
  • src/nat/agent/responses_api_agent/register.py
  • src/nat/profiler/callbacks/llama_index_callback_handler.py
  • src/nat/data_models/openai_mcp.py
  • src/nat/profiler/callbacks/langchain_callback_handler.py
  • src/nat/data_models/llm.py
{src/**/*.py,packages/*/src/**/*.py}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful

Files:

  • src/nat/agent/responses_api_agent/__init__.py
  • src/nat/eval/evaluate.py
  • src/nat/utils/responses_api.py
  • src/nat/eval/usage_stats.py
  • src/nat/profiler/callbacks/token_usage_base_model.py
  • packages/nvidia_nat_crewai/src/nat/plugins/crewai/llm.py
  • src/nat/agent/register.py
  • src/nat/data_models/intermediate_step.py
  • src/nat/agent/responses_api_agent/register.py
  • src/nat/profiler/callbacks/llama_index_callback_handler.py
  • src/nat/data_models/openai_mcp.py
  • packages/nvidia_nat_semantic_kernel/src/nat/plugins/semantic_kernel/llm.py
  • src/nat/profiler/callbacks/langchain_callback_handler.py
  • packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py
  • packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py
  • src/nat/data_models/llm.py
  • packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards. - For Python code, follow
    PEP 20 and
    PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0, and should contain an Apache License 2.0 header comment at the top of each file.

  • Confirm that copyright years are up-to date whenever a file is changed.

Files:

  • src/nat/agent/responses_api_agent/__init__.py
  • examples/frameworks/multi_frameworks/pyproject.toml
  • src/nat/eval/evaluate.py
  • src/nat/utils/responses_api.py
  • src/nat/eval/usage_stats.py
  • src/nat/profiler/callbacks/token_usage_base_model.py
  • examples/agents/tool_calling/README.md
  • packages/nvidia_nat_crewai/src/nat/plugins/crewai/llm.py
  • src/nat/agent/register.py
  • packages/nvidia_nat_langchain/tests/test_llm_langchain.py
  • src/nat/data_models/intermediate_step.py
  • docs/source/workflows/about/reponses-api-and-agent.md
  • src/nat/agent/responses_api_agent/register.py
  • examples/agents/tool_calling/configs/config-responses-api.yml
  • src/nat/profiler/callbacks/llama_index_callback_handler.py
  • tests/nat/agent/test_responses_api_agent.py
  • src/nat/data_models/openai_mcp.py
  • packages/nvidia_nat_crewai/tests/test_llm_crewai.py
  • docs/source/workflows/about/index.md
  • packages/nvidia_nat_semantic_kernel/src/nat/plugins/semantic_kernel/llm.py
  • packages/nvidia_nat_llama_index/tests/test_llm_llama_index.py
  • src/nat/profiler/callbacks/langchain_callback_handler.py
  • packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py
  • packages/nvidia_nat_semantic_kernel/tests/test_llm_sk.py
  • packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py
  • packages/nvidia_nat_llama_index/pyproject.toml
  • packages/nvidia_nat_agno/tests/test_llm_agno.py
  • src/nat/data_models/llm.py
  • packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py
examples/**/*

⚙️ CodeRabbit configuration file

examples/**/*: - This directory contains example code and usage scenarios for the toolkit; at a minimum, an example should
contain a README.md or README.ipynb file.

  • If an example contains Python code, it should be placed in a subdirectory named src/ and should
    contain a pyproject.toml file. Optionally, it might also contain scripts in a scripts/ directory.
  • If an example contains YAML files, they should be placed in a subdirectory named configs/.
  • If an example contains sample data files, they should be placed in a subdirectory named data/, and should
    be checked into git-lfs.

Files:

  • examples/frameworks/multi_frameworks/pyproject.toml
  • examples/agents/tool_calling/README.md
  • examples/agents/tool_calling/configs/config-responses-api.yml
**/README.@(md|ipynb)

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Ensure READMEs follow the naming convention; avoid deprecated names; use “NeMo Agent Toolkit” (capital T) in headings

Files:

  • examples/agents/tool_calling/README.md
packages/*/src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Importable Python code inside packages must live under packages/*/src/

Files:

  • packages/nvidia_nat_crewai/src/nat/plugins/crewai/llm.py
  • packages/nvidia_nat_semantic_kernel/src/nat/plugins/semantic_kernel/llm.py
  • packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py
  • packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py
  • packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py
packages/**/*

⚙️ CodeRabbit configuration file

packages/**/*: - This directory contains optional plugin packages for the toolkit; each should contain a pyproject.toml file.

  • The pyproject.toml file should declare a dependency on nvidia-nat or another package with a name starting
    with nvidia-nat-. This dependency should be declared using ~=<version>, and the version should be a
    two-digit version (ex: ~=1.0).
  • Not all packages contain Python code; if they do, they should also contain their own set of tests, in a
    tests/ directory at the same level as the pyproject.toml file.

Files:

  • packages/nvidia_nat_crewai/src/nat/plugins/crewai/llm.py
  • packages/nvidia_nat_langchain/tests/test_llm_langchain.py
  • packages/nvidia_nat_crewai/tests/test_llm_crewai.py
  • packages/nvidia_nat_semantic_kernel/src/nat/plugins/semantic_kernel/llm.py
  • packages/nvidia_nat_llama_index/tests/test_llm_llama_index.py
  • packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py
  • packages/nvidia_nat_semantic_kernel/tests/test_llm_sk.py
  • packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py
  • packages/nvidia_nat_llama_index/pyproject.toml
  • packages/nvidia_nat_agno/tests/test_llm_agno.py
  • packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py
packages/*/tests/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

If a package contains Python code, include tests in a tests/ directory at the same level as pyproject.toml

Files:

  • packages/nvidia_nat_langchain/tests/test_llm_langchain.py
  • packages/nvidia_nat_crewai/tests/test_llm_crewai.py
  • packages/nvidia_nat_llama_index/tests/test_llm_llama_index.py
  • packages/nvidia_nat_semantic_kernel/tests/test_llm_sk.py
  • packages/nvidia_nat_agno/tests/test_llm_agno.py
docs/source/**/*.md

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

docs/source/**/*.md: Use the official naming throughout documentation: first use “NVIDIA NeMo Agent toolkit”, subsequent “NeMo Agent toolkit”; never use deprecated names (Agent Intelligence toolkit, aiqtoolkit, AgentIQ, AIQ/aiq)
Documentation sources are Markdown files under docs/source; images belong in docs/source/_static
Keep docs in sync with code; documentation pipeline must pass Sphinx and link checks; avoid TODOs/FIXMEs/placeholders; avoid offensive/outdated terms; ensure spelling correctness
Do not use words listed in ci/vale/styles/config/vocabularies/nat/reject.txt; accepted terms in accept.txt are allowed

Files:

  • docs/source/workflows/about/reponses-api-and-agent.md
  • docs/source/workflows/about/index.md
docs/source/**/*

⚙️ CodeRabbit configuration file

This directory contains the source code for the documentation. All documentation should be written in Markdown format. Any image files should be placed in the docs/source/_static directory.

Files:

  • docs/source/workflows/about/reponses-api-and-agent.md
  • docs/source/workflows/about/index.md
**/*.{yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

In workflow/config YAML, set llms.<name>._type: nat_test_llm to stub responses.

Files:

  • examples/agents/tool_calling/configs/config-responses-api.yml
**/configs/**

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Configuration files consumed by code must be stored next to that code in a configs/ folder

Files:

  • examples/agents/tool_calling/configs/config-responses-api.yml
tests/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Unit tests reside under tests/ and should use markers defined in pyproject.toml (e.g., integration)

Files:

  • tests/nat/agent/test_responses_api_agent.py

⚙️ CodeRabbit configuration file

tests/**/*.py:

  • Ensure that tests are comprehensive, cover edge cases, and validate the functionality of the code.
  • Test functions should be named using the test_ prefix, using snake_case.
  • Any frequently repeated code should be extracted into pytest fixtures.
  • Pytest fixtures should define the name argument when applying the pytest.fixture decorator. The fixture
    function being decorated should be named using the fixture_ prefix, using snake_case. Example:
    @pytest.fixture(name="my_fixture")
    def fixture_my_fixture():
        pass

Files:

  • tests/nat/agent/test_responses_api_agent.py
{tests/**/*.py,examples/*/tests/**/*.py}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

{tests/**/*.py,examples/*/tests/**/*.py}: Use pytest (with pytest-asyncio for async); name test files test_*.py; test functions start with test_; extract repeated code into fixtures; fixtures must set name in decorator and be named with fixture_ prefix
Mock external services with pytest_httpserver or unittest.mock; do not hit live endpoints
Mark expensive tests with @pytest.mark.slow or @pytest.mark.integration

Files:

  • tests/nat/agent/test_responses_api_agent.py
packages/*/pyproject.toml

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

packages/*/pyproject.toml: Each package must contain a pyproject.toml
In packages, declare a dependency on nvidia-nat or packages starting with nvidia-nat-
Use ~= version constraints (e.g., ~=1.0) for dependencies

Files:

  • packages/nvidia_nat_llama_index/pyproject.toml
{packages/*/pyproject.toml,uv.lock}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Add new dependencies to both pyproject.toml (alphabetically) and uv.lock via uv pip install --sync

Files:

  • packages/nvidia_nat_llama_index/pyproject.toml
🧠 Learnings (1)
📚 Learning: 2025-09-23T18:39:15.023Z
Learnt from: CR
PR: NVIDIA/NeMo-Agent-Toolkit#0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-09-23T18:39:15.023Z
Learning: Applies to packages/*/pyproject.toml : In packages, declare a dependency on nvidia-nat or packages starting with nvidia-nat-

Applied to files:

  • packages/nvidia_nat_llama_index/pyproject.toml
🧬 Code graph analysis (12)
packages/nvidia_nat_crewai/src/nat/plugins/crewai/llm.py (1)
src/nat/utils/responses_api.py (1)
  • validate_no_responses_api (20-25)
packages/nvidia_nat_langchain/tests/test_llm_langchain.py (6)
src/nat/builder/builder.py (1)
  • Builder (68-290)
src/nat/builder/framework_enum.py (1)
  • LLMFrameworkEnum (19-25)
src/nat/llm/aws_bedrock_llm.py (1)
  • AWSBedrockModelConfig (33-63)
src/nat/llm/nim_llm.py (1)
  • NIMModelConfig (34-52)
src/nat/llm/openai_llm.py (2)
  • openai_llm (52-54)
  • OpenAIModelConfig (31-48)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py (1)
  • aws_bedrock_langchain (116-130)
src/nat/agent/responses_api_agent/register.py (6)
src/nat/builder/builder.py (1)
  • Builder (68-290)
src/nat/builder/framework_enum.py (1)
  • LLMFrameworkEnum (19-25)
src/nat/builder/function_info.py (2)
  • FunctionInfo (290-625)
  • from_fn (552-625)
src/nat/data_models/component_ref.py (2)
  • FunctionRef (94-102)
  • LLMRef (116-124)
src/nat/data_models/function.py (1)
  • FunctionBaseConfig (26-27)
src/nat/agent/tool_calling_agent/agent.py (2)
  • ToolCallAgentGraph (49-221)
  • ToolCallAgentGraphState (44-46)
src/nat/profiler/callbacks/llama_index_callback_handler.py (1)
src/nat/data_models/api_server.py (1)
  • ChatResponse (337-387)
tests/nat/agent/test_responses_api_agent.py (2)
src/nat/agent/responses_api_agent/register.py (2)
  • ResponsesAPIAgentWorkflowConfig (34-57)
  • responses_api_agent_workflow (61-127)
tests/conftest.py (1)
  • mock_tool (351-373)
packages/nvidia_nat_crewai/tests/test_llm_crewai.py (5)
src/nat/builder/builder.py (1)
  • Builder (68-290)
src/nat/builder/framework_enum.py (1)
  • LLMFrameworkEnum (19-25)
src/nat/llm/nim_llm.py (1)
  • NIMModelConfig (34-52)
src/nat/llm/openai_llm.py (2)
  • openai_llm (52-54)
  • OpenAIModelConfig (31-48)
packages/nvidia_nat_crewai/src/nat/plugins/crewai/llm.py (2)
  • nim_crewai (114-131)
  • openai_crewai (135-143)
packages/nvidia_nat_semantic_kernel/src/nat/plugins/semantic_kernel/llm.py (1)
src/nat/utils/responses_api.py (1)
  • validate_no_responses_api (20-25)
packages/nvidia_nat_llama_index/tests/test_llm_llama_index.py (6)
src/nat/builder/builder.py (1)
  • Builder (68-290)
src/nat/builder/framework_enum.py (1)
  • LLMFrameworkEnum (19-25)
src/nat/llm/aws_bedrock_llm.py (1)
  • AWSBedrockModelConfig (33-63)
src/nat/llm/nim_llm.py (1)
  • NIMModelConfig (34-52)
src/nat/llm/openai_llm.py (2)
  • openai_llm (52-54)
  • OpenAIModelConfig (31-48)
packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py (1)
  • aws_bedrock_llama_index (82-93)
packages/nvidia_nat_semantic_kernel/tests/test_llm_sk.py (4)
src/nat/builder/builder.py (1)
  • Builder (68-290)
src/nat/builder/framework_enum.py (1)
  • LLMFrameworkEnum (19-25)
src/nat/llm/openai_llm.py (2)
  • openai_llm (52-54)
  • OpenAIModelConfig (31-48)
packages/nvidia_nat_semantic_kernel/src/nat/plugins/semantic_kernel/llm.py (1)
  • openai_semantic_kernel (106-114)
packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py (1)
src/nat/utils/responses_api.py (1)
  • validate_no_responses_api (20-25)
packages/nvidia_nat_agno/tests/test_llm_agno.py (4)
src/nat/llm/nim_llm.py (1)
  • NIMModelConfig (34-52)
packages/nvidia_nat_crewai/tests/test_llm_crewai.py (2)
  • mock_builder (40-41)
  • mock_builder (91-92)
packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py (2)
  • nim_agno (81-95)
  • openai_agno (99-117)
src/nat/llm/openai_llm.py (1)
  • OpenAIModelConfig (31-48)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py (1)
src/nat/cli/entrypoint.py (1)
  • cli (84-102)
🪛 LanguageTool
examples/agents/tool_calling/README.md

[grammar] ~182-~182: There might be a mistake here.
Context: ...ol Calling with the OpenAI Responses API The NeMo Agent toolkit also provides an ...

(QB_NEW_EN)


[grammar] ~192-~192: There might be a mistake here.
Context: ...r the Responses API. #### Prerequisites - Set your OpenAI API key in the environme...

(QB_NEW_EN)


[grammar] ~231-~231: There might be a mistake here.
Context: ... - **nat_tools**: Tools implemented in NeMo Agent toolkit (for example, curre...

(QB_NEW_EN)


[grammar] ~279-~279: There might be a mistake here.
Context: ... the specific built-in tools you enable. - Some built-ins (for example, file search...

(QB_NEW_EN)


[grammar] ~281-~281: There might be a mistake here.
Context: ...cumentation for current requirements. - If tool calls error and `handle_tool_error...

(QB_NEW_EN)

docs/source/workflows/about/reponses-api-and-agent.md

[grammar] ~32-~32: There might be a mistake here.
Context: ...at_tools(executed by the agent graph). - **Agentic Workflow**: The_type: response...

(QB_NEW_EN)


[grammar] ~131-~131: There might be a mistake here.
Context: ...ders to include when calling the server.

(QB_NEW_EN)

🪛 Ruff (0.14.0)
src/nat/utils/responses_api.py

24-25: Avoid specifying long messages outside the exception class

(TRY003)

src/nat/agent/responses_api_agent/register.py

115-115: Consider moving this statement to an else block

(TRY300)


117-117: Redundant exception object included in logging.exception call

(TRY401)

src/nat/profiler/callbacks/llama_index_callback_handler.py

83-83: Do not catch blind exception: Exception

(BLE001)


200-201: try-except-pass detected, consider logging the exception

(S110)


200-200: Do not catch blind exception: Exception

(BLE001)

tests/nat/agent/test_responses_api_agent.py

30-30: Unused method argument: llm_name

(ARG002)


30-30: Unused method argument: wrapper_type

(ARG002)


34-34: Unused method argument: tool_names

(ARG002)


34-34: Unused method argument: wrapper_type

(ARG002)


43-43: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


44-44: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


46-46: Unused noqa directive (non-enabled: D401)

Remove unused noqa directive

(RUF100)


49-49: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


50-50: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


51-51: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


54-54: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


61-61: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


62-62: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


74-74: Loop control variable function_info not used within loop body

(B007)

src/nat/profiler/callbacks/langchain_callback_handler.py

54-54: Do not catch blind exception: Exception

(BLE001)


262-263: try-except-pass detected, consider logging the exception

(S110)


262-262: Do not catch blind exception: Exception

(BLE001)

packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py

87-88: Avoid specifying long messages outside the exception class

(TRY003)


102-103: Avoid specifying long messages outside the exception class

(TRY003)


116-117: Avoid specifying long messages outside the exception class

(TRY003)

packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py

121-122: Avoid specifying long messages outside the exception class

(TRY003)


139-140: Avoid specifying long messages outside the exception class

(TRY003)


153-154: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (21)
src/nat/eval/evaluate.py (1)

107-108: LGTM!

The token usage tracking correctly extends to include cached and reasoning tokens, maintaining consistency with the existing accumulation pattern.

packages/nvidia_nat_llama_index/pyproject.toml (1)

34-34: Verify the constraint change from ~= to >= is intentional.

The dependency constraint for llama-index-llms-openai changed from a compatible release constraint (~=) to a minimum version constraint (>=). This is less restrictive and could introduce breaking changes from future major versions.

If this change is intentional (e.g., to support multiple major versions), please confirm. Otherwise, consider using ~=0.4.2 to follow the pattern used by other dependencies and avoid potential compatibility issues.

Based on coding guidelines.

packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py (1)

112-115: LGTM!

The conditional selection between OpenAIResponses and OpenAIChat based on api_type is correctly implemented and follows the expected pattern.

src/nat/profiler/callbacks/token_usage_base_model.py (1)

27-28: LGTM!

The new token fields are correctly implemented with appropriate types, defaults, and descriptions, maintaining consistency with the existing field pattern.
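
As a rough illustration (field names follow the callback changes discussed later in this review; the class name and descriptions are assumptions), the additions look something like:

from pydantic import BaseModel, Field


class TokenUsageBaseModel(BaseModel):
    # Existing prompt/completion token fields omitted for brevity.
    cached_tokens: int = Field(default=0, description="Cached prompt tokens reported by the provider.")
    reasoning_tokens: int = Field(default=0, description="Reasoning tokens reported by the provider.")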

src/nat/eval/usage_stats.py (1)

24-25: LGTM!

The new token usage fields are correctly added to UsageStatsLLM, maintaining consistency with the token usage model and evaluation logic.

src/nat/data_models/llm.py (3)

17-18: LGTM!

The Enum import is correctly added to support the new APITypeEnum.


25-27: LGTM!

The APITypeEnum is correctly implemented as a string enum with clear, descriptive values for the two API types.
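
For reference, a string enum with the two documented api_type values would look roughly like this; member names follow the usage seen elsewhere in this review, so treat it as a sketch rather than the exact source:

from enum import Enum


class APITypeEnum(str, Enum):
    CHAT_COMPLETION = "chat_completions"
    RESPONSES = "responses"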


33-39: LGTM!

The api_type field is properly configured with:

  • Appropriate default value (CHAT_COMPLETION)
  • Clear description
  • Schema metadata for documentation
  • exclude=True to prevent serialization

The implementation follows Pydantic best practices.

examples/frameworks/multi_frameworks/pyproject.toml (1)

17-17: OpenAI dependency confirmed: openai~=1.86.0 exists on PyPI with no known security advisories.

src/nat/data_models/openai_mcp.py (1)

1-46: LGTM!

Clean Pydantic model definitions for OpenAI MCP tool configuration. The schema is well-documented with field descriptions, appropriate defaults, and correct use of use_enum_values=True for enum serialization.
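
A minimal sketch of what such a remote MCP tool entry can look like, using the field names from the example configuration; the class name, enum, and defaults here are assumptions, not the actual source:

from enum import Enum

from pydantic import BaseModel, ConfigDict, Field


class RequireApprovalEnum(str, Enum):
    NEVER = "never"
    ALWAYS = "always"


class OpenAIMCPToolSketch(BaseModel):
    model_config = ConfigDict(use_enum_values=True)

    type: str = Field(default="mcp", description="Tool type discriminator.")
    server_label: str = Field(description="Label identifying the remote MCP server.")
    server_url: str = Field(description="URL of the remote MCP server.")
    allowed_tools: list[str] | None = Field(default=None, description="Optional allow-list of tool names.")
    require_approval: RequireApprovalEnum = Field(default=RequireApprovalEnum.NEVER,
                                                  description="Whether tool calls require approval.")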

src/nat/agent/responses_api_agent/__init__.py (1)

1-14: LGTM!

Standard package initialization file with proper Apache 2.0 license header.

examples/agents/tool_calling/README.md (1)

182-281: LGTM!

Comprehensive documentation for the Responses API agent integration. The section clearly explains the differences between Responses API and Chat Completions, provides setup instructions, configuration examples, and troubleshooting guidance. The code examples and YAML snippets are well-formatted and easy to follow.

Note: The LanguageTool warnings are false positives related to technical terminology (e.g., "NeMo Agent toolkit").

docs/source/workflows/about/index.md (1)

72-72: LGTM!

Adds the new Responses API agent to the workflows navigation menu, following the existing documentation structure.

src/nat/agent/register.py (1)

22-22: LGTM!

Properly registers the new Responses API agent workflow using the standard import alias pattern consistent with other agent workflows.

packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py (1)

130-137: LGTM!

Excellent implementation of conditional LLM client instantiation based on API type. The branching logic correctly instantiates OpenAIResponses for Responses API mode and falls back to standard OpenAI for Chat Completions, enabling proper support for both OpenAI API paradigms.

packages/nvidia_nat_agno/tests/test_llm_agno.py (2)

44-72: Good negative test for NIM RESPONSES path.

Validates rejection and ensures Nvidia client is not instantiated.


173-187: Good positive test for OpenAI RESPONSES path.

Confirms OpenAIResponses is selected and id forwarded.

src/nat/profiler/callbacks/langchain_callback_handler.py (1)

98-119: Token usage extraction: good addition for cached/reasoning tokens.

The extended fields (cached_tokens, reasoning_tokens) improve attribution and align with Responses API usage.

packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py (1)

170-179: ChatOpenAI flags compatible with pinned langchain-openai version

langchain-openai~=0.3.32 supports use_responses_api, use_previous_response_id (auto mode since ~0.3.26), and stream_usage; no runtime kwargs errors expected.
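
A minimal sketch of how those flags are passed to ChatOpenAI, assuming langchain-openai ~=0.3.32 as pinned above; the model name is a placeholder:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o-mini",            # placeholder model name
    use_responses_api=True,         # route requests through the Responses API
    use_previous_response_id=True,  # chain turns via previous response IDs
    stream_usage=True,              # include token usage in streamed chunks
)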

packages/nvidia_nat_semantic_kernel/tests/test_llm_sk.py (1)

49-56: LGTM: validates correct constructor args.

Patch target and asserted ai_model_id match the wrapper’s code path.

packages/nvidia_nat_llama_index/tests/test_llm_llama_index.py (1)

100-109: LGTM: Responses-API branch exercised correctly.

Verifies OpenAIResponses selected and kwarg passthrough (model, temperature).

Upgraded multiple Llama Index and related dependencies to their latest versions, ensuring compatibility and accessing new features. Added validation for "no responses API" in LLM configurations and modified API references to improve clarity and performance. Removed unused imports and outdated assertions for cleaner code and reduced redundancy.

Signed-off-by: dnandakumar-nv <[email protected]>
@dnandakumar-nv dnandakumar-nv requested a review from a team as a code owner October 16, 2025 14:52

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

♻️ Duplicate comments (1)
packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py (1)

101-103: Same TRY003 observation applies here.

These validation blocks follow the same pattern as AWS Bedrock (lines 86-88) and have the same optional TRY003 style suggestion.

Also applies to: 115-117

🧹 Nitpick comments (7)
packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py (2)

86-88: Validation logic is correct; consider extracting error messages (optional).

The API type validation correctly enforces that AWS Bedrock only supports CHAT_COMPLETION. The error message is clear and informative.

Static analysis flagged TRY003 (long message in exception), suggesting extraction into a custom exception class. However, given the straightforward error case and "Chill" review mode, this is a pedantic style suggestion that can be safely deferred.


128-133: Consider explicit validation in the else branch for consistency.

The conditional logic correctly instantiates OpenAIResponses when api_type == APITypeEnum.RESPONSES and falls back to OpenAI otherwise. However, the else branch doesn't explicitly validate that api_type is CHAT_COMPLETION, unlike the other providers (AWS Bedrock, Azure OpenAI, NVIDIA) which all enforce this check.

While the current implementation is safe given the two known enum values, adding explicit validation would improve consistency and make the code more defensive against future enum additions.

Consider applying this diff for consistency:

     if llm_config.api_type == APITypeEnum.RESPONSES:
         llm = OpenAIResponses(**llm_config.model_dump(exclude={"type", "thinking"}, by_alias=True, exclude_none=True))
+    elif llm_config.api_type == APITypeEnum.CHAT_COMPLETION:
+        llm = OpenAI(**llm_config.model_dump(exclude={"type", "thinking"}, by_alias=True, exclude_none=True))
     else:
-        llm = OpenAI(**llm_config.model_dump(exclude={"type", "thinking"}, by_alias=True, exclude_none=True))
+        raise ValueError(f"OpenAI only supports chat completion or responses API type. "
+                         f"Received: {llm_config.api_type}")
packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py (1)

114-117: Consider explicit enum value handling for defensive coding.

The conditional logic correctly instantiates OpenAIResponses for APITypeEnum.RESPONSES and defaults to OpenAIChat otherwise. While this works for the current two-value enum, explicitly checking for APITypeEnum.CHAT_COMPLETION in an elif clause and raising an error for unexpected values would make the code more defensive against future enum additions.

Example refactor:

-    if llm_config.api_type == APITypeEnum.RESPONSES:
-        client = OpenAIResponses(**config_obj, id=llm_config.model_name)
-    else:
-        client = OpenAIChat(**config_obj, id=llm_config.model_name)
+    if llm_config.api_type == APITypeEnum.RESPONSES:
+        client = OpenAIResponses(**config_obj, id=llm_config.model_name)
+    elif llm_config.api_type == APITypeEnum.CHAT_COMPLETION:
+        client = OpenAIChat(**config_obj, id=llm_config.model_name)
+    else:
+        raise ValueError(f"Unsupported API type for OpenAI: {llm_config.api_type}")
tests/nat/agent/test_responses_api_agent.py (1)

81-88: Remove dead helper to keep tests lean (and avoid B007).

This helper isn’t used; delete to reduce noise.

-async def _consume_function_info(gen):
-    """Helper to consume a single yield from the async generator and return FunctionInfo."""
-    function_info = None
-    async for function_info in gen:
-        break
-    assert function_info is not None
-    return function_info
src/nat/agent/responses_api_agent/register.py (3)

35-39: Fix grammar/typos in config docstring.

-    """
-    Defines an NeMo Agent Toolkit function that uses a Responses API
-    Agent performs reasoning inbetween tool calls, and utilizes the
-    tool names and descriptions to select the optimal tool.
-    """
+    """
+    Defines a NeMo Agent Toolkit function that uses the Responses API.
+    The agent performs reasoning in between tool calls and utilizes tool
+    names and descriptions to select the optimal tool.
+    """

69-71: Consider raising a ValueError instead of assert for runtime validation.

Using assert can be stripped with -O; raising a typed error preserves validation in all modes. If you change this, update the test expecting AssertionError.

As per coding guidelines

-    llm: ChatOpenAI = await builder.get_llm(config.llm_name, wrapper_type=LLMFrameworkEnum.LANGCHAIN)
-    assert llm.use_responses_api, "Responses API Agent requires an LLM that supports the Responses API."
+    llm: ChatOpenAI = await builder.get_llm(config.llm_name, wrapper_type=LLMFrameworkEnum.LANGCHAIN)
+    if not getattr(llm, "use_responses_api", False):
+        raise ValueError("Responses API Agent requires an LLM that supports the Responses API.")

115-119: Simplify exception logging; avoid redundant exc_info and argument.

logger.exception already records the active exception; remove extra details and condense return.

-        except Exception as ex:
-            logger.exception("%s Tool Calling Agent failed with exception: %s", AGENT_LOG_PREFIX, ex, exc_info=ex)
-            if config.verbose:
-                return str(ex)
-            return "I seem to be having a problem."
+        except Exception as ex:
+            logger.exception("%s Tool Calling Agent failed", AGENT_LOG_PREFIX)
+            return str(ex) if config.verbose else "I seem to be having a problem."
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4fc8080 and d2563ee.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • examples/agents/tool_calling/README.md (2 hunks)
  • examples/frameworks/multi_frameworks/pyproject.toml (1 hunks)
  • packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py (6 hunks)
  • packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py (6 hunks)
  • packages/nvidia_nat_langchain/tests/test_llm_langchain.py (1 hunks)
  • packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py (5 hunks)
  • src/nat/agent/responses_api_agent/register.py (1 hunks)
  • tests/nat/agent/test_responses_api_agent.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/nvidia_nat_langchain/tests/test_llm_langchain.py
🧰 Additional context used
📓 Path-based instructions (12)
**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

  • src/nat/agent/responses_api_agent/register.py
  • packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py
  • packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py
  • tests/nat/agent/test_responses_api_agent.py
  • packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("<name>", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.<framework>) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

  • src/nat/agent/responses_api_agent/register.py
  • packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py
  • packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py
  • tests/nat/agent/test_responses_api_agent.py
  • packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py
src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All importable Python code must live under src/ (or packages/*/src/)

Files:

  • src/nat/agent/responses_api_agent/register.py
src/nat/**/*

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Changes in src/nat should prioritize backward compatibility

Files:

  • src/nat/agent/responses_api_agent/register.py

⚙️ CodeRabbit configuration file

This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.

Files:

  • src/nat/agent/responses_api_agent/register.py
{src/**/*.py,packages/*/src/**/*.py}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful

Files:

  • src/nat/agent/responses_api_agent/register.py
  • packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py
  • packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py
  • packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards.
  • For Python code, follow PEP 20 and PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues.
  • Python methods should use type hints for all parameters and return values. Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions

  • Verify that documentation and comments are clear and comprehensive.
  • Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum".
  • Verify that the documentation doesn't contain any offensive or outdated terms.
  • Verify that documentation and comments are free of spelling mistakes, and that the documentation doesn't
    contain any words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file; words that might
    appear to be spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt
    file are OK.

Misc.

  • All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,
    and should contain an Apache License 2.0 header comment at the top of each file.
  • Confirm that copyright years are up-to-date whenever a file is changed.

Files:

  • src/nat/agent/responses_api_agent/register.py
  • packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py
  • packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py
  • tests/nat/agent/test_responses_api_agent.py
  • packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py
  • examples/frameworks/multi_frameworks/pyproject.toml
  • examples/agents/tool_calling/README.md
packages/*/src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Importable Python code inside packages must live under packages/*/src/

Files:

  • packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py
  • packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py
  • packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py
packages/**/*

⚙️ CodeRabbit configuration file

packages/**/*: - This directory contains optional plugin packages for the toolkit; each should contain a pyproject.toml file.

  • The pyproject.toml file should declare a dependency on nvidia-nat or another package with a name starting
    with nvidia-nat-. This dependency should be declared using ~=<version>, and the version should be a
    two-digit version (ex: ~=1.0).
  • Not all packages contain Python code; if they do, they should also contain their own set of tests, in a
    tests/ directory at the same level as the pyproject.toml file.

Files:

  • packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py
  • packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py
  • packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py
tests/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Unit tests reside under tests/ and should use markers defined in pyproject.toml (e.g., integration)

Files:

  • tests/nat/agent/test_responses_api_agent.py

⚙️ CodeRabbit configuration file

tests/**/*.py:

  • Ensure that tests are comprehensive, cover edge cases, and validate the functionality of the code.
  • Test functions should be named using the test_ prefix, using snake_case.
  • Any frequently repeated code should be extracted into pytest fixtures.
  • Pytest fixtures should define the name argument when applying the pytest.fixture decorator. The fixture
    function being decorated should be named using the fixture_ prefix, using snake_case. Example:
    @pytest.fixture(name="my_fixture")
    def fixture_my_fixture():
        pass

Files:

  • tests/nat/agent/test_responses_api_agent.py
{tests/**/*.py,examples/*/tests/**/*.py}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

{tests/**/*.py,examples/*/tests/**/*.py}: Use pytest (with pytest-asyncio for async); name test files test_*.py; test functions start with test_; extract repeated code into fixtures; fixtures must set name in decorator and be named with fixture_ prefix
Mock external services with pytest_httpserver or unittest.mock; do not hit live endpoints
Mark expensive tests with @pytest.mark.slow or @pytest.mark.integration

Files:

  • tests/nat/agent/test_responses_api_agent.py
examples/**/*

⚙️ CodeRabbit configuration file

examples/**/*: - This directory contains example code and usage scenarios for the toolkit; at a minimum, an example should
contain a README.md or README.ipynb file.

  • If an example contains Python code, it should be placed in a subdirectory named src/ and should
    contain a pyproject.toml file. Optionally, it might also contain scripts in a scripts/ directory.
  • If an example contains YAML files, they should be placed in a subdirectory named configs/.
  • If an example contains sample data files, they should be placed in a subdirectory named data/, and should
    be checked into git-lfs.

Files:

  • examples/frameworks/multi_frameworks/pyproject.toml
  • examples/agents/tool_calling/README.md
**/README.@(md|ipynb)

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Ensure READMEs follow the naming convention; avoid deprecated names; use “NeMo Agent Toolkit” (capital T) in headings

Files:

  • examples/agents/tool_calling/README.md
🧬 Code graph analysis (3)
src/nat/agent/responses_api_agent/register.py (7)
src/nat/builder/builder.py (1)
  • Builder (68-290)
src/nat/builder/framework_enum.py (1)
  • LLMFrameworkEnum (19-25)
src/nat/builder/function_info.py (2)
  • FunctionInfo (290-625)
  • from_fn (552-625)
src/nat/data_models/component_ref.py (2)
  • FunctionRef (94-102)
  • LLMRef (116-124)
src/nat/data_models/function.py (1)
  • FunctionBaseConfig (26-27)
src/nat/agent/tool_calling_agent/agent.py (2)
  • ToolCallAgentGraph (49-221)
  • ToolCallAgentGraphState (44-46)
tests/nat/agent/test_responses_api_agent.py (3)
  • get_llm (30-32)
  • get_tools (34-36)
  • bind_tools (46-62)
packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py (1)
src/nat/utils/responses_api.py (1)
  • validate_no_responses_api (20-25)
tests/nat/agent/test_responses_api_agent.py (2)
src/nat/agent/responses_api_agent/register.py (2)
  • ResponsesAPIAgentWorkflowConfig (34-57)
  • responses_api_agent_workflow (61-126)
tests/conftest.py (1)
  • mock_tool (351-373)
🪛 LanguageTool
examples/agents/tool_calling/README.md

[grammar] ~182-~182: There might be a mistake here.
Context: ...ol Calling with the OpenAI Responses API The NeMo Agent toolkit also provides an ...

(QB_NEW_EN)


[grammar] ~192-~192: There might be a mistake here.
Context: ...r the Responses API. #### Prerequisites - Set your OpenAI API key in the environme...

(QB_NEW_EN)


[grammar] ~231-~231: There might be a mistake here.
Context: ... - **nat_tools**: Tools implemented in NeMo Agent toolkit (for example, curre...

(QB_NEW_EN)


[grammar] ~279-~279: There might be a mistake here.
Context: ... the specific built-in tools you enable. - Some built-ins (for example, file search...

(QB_NEW_EN)


[grammar] ~281-~281: There might be a mistake here.
Context: ...cumentation for current requirements. - If tool calls error and `handle_tool_error...

(QB_NEW_EN)

🪛 Ruff (0.14.0)
src/nat/agent/responses_api_agent/register.py

114-114: Consider moving this statement to an else block

(TRY300)


116-116: Redundant exception object included in logging.exception call

(TRY401)

packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py

119-120: Avoid specifying long messages outside the exception class

(TRY003)


137-138: Avoid specifying long messages outside the exception class

(TRY003)


151-152: Avoid specifying long messages outside the exception class

(TRY003)

tests/nat/agent/test_responses_api_agent.py

30-30: Unused method argument: llm_name

(ARG002)


30-30: Unused method argument: wrapper_type

(ARG002)


34-34: Unused method argument: tool_names

(ARG002)


34-34: Unused method argument: wrapper_type

(ARG002)


43-43: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


44-44: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


46-46: Unused noqa directive (non-enabled: D401)

Remove unused noqa directive

(RUF100)


56-56: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


60-60: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


61-61: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


64-64: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


71-71: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


72-72: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


84-84: Loop control variable function_info not used within loop body

(B007)

packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py

87-88: Avoid specifying long messages outside the exception class

(TRY003)


102-103: Avoid specifying long messages outside the exception class

(TRY003)


116-117: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (4)
packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py (1)

22-22: LGTM! Import added correctly.

The APITypeEnum import enables API-type-based branching throughout the LlamaIndex integration.

packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py (3)

21-21: LGTM! Imports are now utilized.

Both imports are correctly used: APITypeEnum enables conditional client construction in openai_agno, and validate_no_responses_api enforces API type restrictions in nim_agno and litellm_agno. This resolves the previous review comment about the unused import.

Also applies to: 32-32


85-86: LGTM! Validation added as requested.

The validate_no_responses_api call correctly rejects Responses API usage for NIM before client instantiation, addressing the previous review comment. This ensures that attempts to use APITypeEnum.RESPONSES with NIM raise a ValueError as expected by the test suite.


127-128: LGTM! Validation correctly enforces API type restrictions.

The validate_no_responses_api call appropriately rejects Responses API usage for LiteLLM before client instantiation, consistent with the pattern used in nim_agno.

Corrected a typo in the documentation file name for clarity and accuracy. Ensuring proper naming maintains consistency and avoids potential confusion.

Signed-off-by: dnandakumar-nv <[email protected]>
Previously, tool_outputs supported a single ServerToolUseSchema. This change modifies it to handle a list, enhancing flexibility and enabling multiple tool uses to be tracked simultaneously.

Signed-off-by: dnandakumar-nv <[email protected]>
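
A rough before-and-after of the field described in that commit; the containing model and the stub schema below are hypothetical stand-ins for illustration only:

from pydantic import BaseModel


class ServerToolUseSchema(BaseModel):
    """Stub stand-in; the real schema's fields are omitted here."""


class StepPayloadSketch(BaseModel):
    # Before: tool_outputs: ServerToolUseSchema | None = None
    # After: a list, so multiple server-side tool uses can be tracked per step.
    tool_outputs: list[ServerToolUseSchema] | None = None
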
@dnandakumar-nv dnandakumar-nv removed the feature request New feature or request label Oct 16, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d2563ee and 9910b29.

📒 Files selected for processing (2)
  • docs/source/workflows/about/responses-api-and-agent.md (1 hunks)
  • src/nat/data_models/intermediate_step.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/nat/data_models/intermediate_step.py
🧰 Additional context used
📓 Path-based instructions (3)
docs/source/**/*.md

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

docs/source/**/*.md: Use the official naming throughout documentation: first use “NVIDIA NeMo Agent toolkit”, subsequent “NeMo Agent toolkit”; never use deprecated names (Agent Intelligence toolkit, aiqtoolkit, AgentIQ, AIQ/aiq)
Documentation sources are Markdown files under docs/source; images belong in docs/source/_static
Keep docs in sync with code; documentation pipeline must pass Sphinx and link checks; avoid TODOs/FIXMEs/placeholders; avoid offensive/outdated terms; ensure spelling correctness
Do not use words listed in ci/vale/styles/config/vocabularies/nat/reject.txt; accepted terms in accept.txt are allowed

Files:

  • docs/source/workflows/about/responses-api-and-agent.md
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards. - For Python code, follow
    PEP 20 and
    PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

  • Confirm that copyright years are up-to date whenever a file is changed.

Files:

  • docs/source/workflows/about/responses-api-and-agent.md
docs/source/**/*

⚙️ CodeRabbit configuration file

This directory contains the source code for the documentation. All documentation should be written in Markdown format. Any image files should be placed in the docs/source/_static directory.

Files:

  • docs/source/workflows/about/responses-api-and-agent.md
🪛 LanguageTool
docs/source/workflows/about/responses-api-and-agent.md

[grammar] ~32-~32: There might be a mistake here.
Context: ...at_tools(executed by the agent graph). - **Agentic Workflow**: The_type: response...

(QB_NEW_EN)


[grammar] ~131-~131: There might be a mistake here.
Context: ...ders to include when calling the server.

(QB_NEW_EN)

Comment on lines +20 to +24
The NeMo Agent toolkit supports OpenAI's Responses API through two complementary pieces:

1) LLM client configuration via the `api_type` field, and 2) a dedicated workflow agent `_type: responses_api_agent` designed for tool use with the Responses API.

Unlike standard chat-based integrations, the Responses API enables models to use built-in tools (for example, Code Interpreter) and connect to remote tools using the Model Context Protocol (MCP). This page explains how to configure an LLM for Responses and how to use the dedicated agent.


⚠️ Potential issue | 🟠 Major

Fix initial toolkit naming.

Per the docs guideline, the first mention must read “NVIDIA NeMo Agent toolkit”; switch the sentence to “The NVIDIA NeMo Agent toolkit supports…”. Later references can remain “NeMo Agent toolkit”.

🤖 Prompt for AI Agents
In docs/source/workflows/about/responses-api-and-agent.md around lines 20 to 24,
the initial toolkit name is incorrect; change the first sentence so the very
first mention reads "The NVIDIA NeMo Agent toolkit supports OpenAI's Responses
API..." (subsequent mentions can remain "NeMo Agent toolkit"). Ensure only the
first occurrence is updated to include "NVIDIA" and keep the rest of the
paragraph unchanged.

Member

@willkill07 willkill07 left a comment


Looks good overall. Just some nits on clarifying what is supported/unsupported.

Comment on lines +33 to +39
api_type: APITypeEnum = Field(default=APITypeEnum.CHAT_COMPLETION,
                              description="The type of API to use for the LLM provider.",
                              json_schema_extra={
                                  "enum": [e.value for e in APITypeEnum],
                                  "examples": [e.value for e in APITypeEnum],
                              },
                              exclude=True)

I know you want exclude=True to work, but it doesn't.

Excluding means that the deep copying we do with configurations will strip this away.
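
A small sketch of the concern, assuming the configuration copy round-trips through model_dump(); the model and field names here are illustrative:

from enum import Enum

from pydantic import BaseModel, Field


class APIType(str, Enum):
    CHAT_COMPLETION = "chat_completions"
    RESPONSES = "responses"


class LLMConfigSketch(BaseModel):
    model_name: str
    api_type: APIType = Field(default=APIType.CHAT_COMPLETION, exclude=True)


cfg = LLMConfigSketch(model_name="example-model", api_type=APIType.RESPONSES)
dumped = cfg.model_dump()                      # api_type is dropped because exclude=True
restored = LLMConfigSketch.model_validate(dumped)
print(restored.api_type)                       # falls back to CHAT_COMPLETION; the setting is lost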

Comment on lines +24 to +25
raise ValueError("Responses API is not supported for config %s. Please use a different API type.",
str(llm_config))

This will dump out the entire configuration. Instead, maybe just print out the type?

Suggested change
raise ValueError("Responses API is not supported for config %s. Please use a different API type.",
                 str(llm_config))
raise ValueError(f"Responses API is not supported for config {str(type(llm_config))}. Please use a different API type.")

except Exception:
    pass

# Append usage data to AIQ Toolkit usage stats

Suggested change
# Append usage data to AIQ Toolkit usage stats
# Append usage data to NAT usage stats

Comment on lines +103 to +112
cache_tokens = 0
reasoning_tokens = 0

if "input_token_details" in usage_metadata:
    if "cache_read" in usage_metadata["input_token_details"]:
        cache_tokens = usage_metadata["input_token_details"]["cache_read"]

if "output_token_details" in usage_metadata:
    if "reasoning" in usage_metadata["output_token_details"]:
        reasoning_tokens = usage_metadata["output_token_details"]["reasoning"]

Style nit

Suggested change
cache_tokens = 0
reasoning_tokens = 0
if "input_token_details" in usage_metadata:
    if "cache_read" in usage_metadata["input_token_details"]:
        cache_tokens = usage_metadata["input_token_details"]["cache_read"]
if "output_token_details" in usage_metadata:
    if "reasoning" in usage_metadata["output_token_details"]:
        reasoning_tokens = usage_metadata["output_token_details"]["reasoning"]
cache_tokens = usage_metadata.get("input_token_details", {}).get("cache_read", 0)
reasoning_tokens = usage_metadata.get("output_token_details", {}).get("reasoning", 0)

Comment on lines +86 to +88
if llm_config.api_type != APITypeEnum.CHAT_COMPLETION:
    raise ValueError("AWS Bedrock LLM only supports chat completion API type. "
                     f"Received: {llm_config.api_type}")

Use/prefer validate_no_responses_api ?

Comment on lines +150 to +152
if llm_config.api_type != APITypeEnum.CHAT_COMPLETION:
    raise ValueError("NVIDIA AI Endpoints only supports chat completion API type. "
                     f"Received: {llm_config.api_type}")

What about openai/gpt-oss-120b ?
https://build.nvidia.com/openai/gpt-oss-120b/modelcard

or is this because langchain-nvidia doesn't?

Comment on lines +118 to +121
if llm_config.api_type != APITypeEnum.CHAT_COMPLETION:
    raise ValueError("AWS Bedrock only supports chat completion API type. "
                     f"Received: {llm_config.api_type}")


Prefer validate_no_responses_api ?

from nat.data_models.llm import APITypeEnum


def validate_no_responses_api(llm_config):

I think this should also accept the framework enum for a clearer message.
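
One possible shape for that suggestion; the signature change and message wording below are illustrative, not the actual helper:

from nat.builder.framework_enum import LLMFrameworkEnum
from nat.data_models.llm import APITypeEnum


def validate_no_responses_api(llm_config, framework: LLMFrameworkEnum) -> None:
    # Reject the Responses API for providers/frameworks that only support chat completions,
    # naming both the config type and the framework in the error message.
    if llm_config.api_type == APITypeEnum.RESPONSES:
        raise ValueError(f"Responses API is not supported for {type(llm_config).__name__} "
                         f"with the {framework.value} framework. Please use a different API type.")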

"llama-index-llms-nvidia~=0.3.1",
"llama-index-llms-openai~=0.3.42",
"llama-index-llms-nvidia~=0.3.4",
"llama-index-llms-openai>=0.4.2",

To maintain ~=-like semantics

Suggested change
"llama-index-llms-openai>=0.4.2",
"llama-index-llms-openai>=0.4.2,<1.0.0",


Labels

improvement (Improvement to existing functionality) · non-breaking (Non-breaking change)
